Evaluation of anchoring schemes for fast DNA sequence alignments
Abstract
Comparative genomics makes extensive use of sequence comparison to identify genes and their possible functions. A whole-genome search must be pruned by a heuristic (the anchoring step) to keep execution time reasonable. We explored anchoring schemes that could resolve the classical trade-off between speed and quality: speed is obtained by generating fewer anchors (thus reducing the costly post-processing), at the risk of missing significant alignments. We propose to use dedicated hardware to solve this problem. This study began as preliminary work for the implementation of anchoring algorithms in a prototype hardware filter currently under development in our team [8].
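To make the trade-off concrete, the following is a minimal sketch of one common anchoring heuristic, exact k-mer seed matching. It is not necessarily any of the schemes evaluated in the study; the function name `find_anchors`, the toy sequences, and the seed lengths are assumptions chosen for illustration only.

```python
from collections import defaultdict


def find_anchors(query, target, k):
    """Return (query_pos, target_pos) pairs where a k-mer matches exactly.

    Hypothetical helper for illustration: exact k-mer seeding is only one
    possible anchoring scheme.
    """
    # Index every k-mer of the target by its start positions.
    index = defaultdict(list)
    for j in range(len(target) - k + 1):
        index[target[j:j + k]].append(j)

    # Look up every k-mer of the query; each hit is one anchor.
    anchors = []
    for i in range(len(query) - k + 1):
        for j in index.get(query[i:i + k], ()):
            anchors.append((i, j))
    return anchors


# Toy sequences (assumed, not taken from the study).
query = "ACGTACGTGA"
target = "TTACGTACGA"

for k in (3, 5, 8):
    print(k, len(find_anchors(query, target, k)))
```

On this toy input, longer seeds yield fewer anchors, and therefore less post-processing, until the seed becomes too long to find the shared region at all. That is the risk of missing significant alignments described above.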
Related articles
Accurate anchoring alignment of divergent sequences
MOTIVATION Obtaining high quality alignments of divergent homologous sequences for cross-species sequence comparison remains a challenge. RESULTS We propose a novel pairwise sequence alignment algorithm, ACANA (ACcurate ANchoring Alignment), for aligning biological sequences at both local and global levels. Like many fast heuristic methods, ACANA uses an anchoring strategy. However, unlike ot...
gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences
Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...
Hybrid DNA Sequence Similarity Scheme for Training Support Vector Machines
Similarity between two DNA sequences is based on alignment. There are different approaches of alignments; each has its own specialty of bearing different information on DNA sequence. This paper presents a study on similarity kernels based on different similarity schemes and proposes a hybrid one. Similarity Kernel is required in order to represent the distance or similarity between two DNA sequ...
Realigner: a Program for Refining DNA Sequence Multi-alignments
We present a round-robin realignment algorithm that improves a potentially crude initial alignment of an assembled collection of DNA sequence fragments, as might, for example, be output by a typical fragment assembly program. The algorithm uses a weighted combination of two scoring schemes to achieve superior multi-alignments, and employs a banded dynamic programming variation to achieve a runn...
Detecting recombination in 4-taxa DNA sequence alignments with Bayesian hidden Markov models and Markov chain Monte Carlo.
This article presents a statistical method for detecting recombination in DNA sequence alignments, which is based on combining two probabilistic graphical models: (1) a taxon graph (phylogenetic tree) representing the relationship between the taxa, and (2) a site graph (hidden Markov model) representing interactions between different sites in the DNA sequence alignments. We adopt a Bayesian app...